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Abstract 

Background: The classical view on eukaryotic gene expression proposes the scheme of a forward flow for which 
fluctuations in mRNA levels upon a stimulus contribute to determine variations in mRNA availability for translation. 
Here we address this issue by simultaneously profiling with microarrays the total mRNAs (the transcriptome) and 
the polysome-associated mRNAs (the translatome) after EGF treatment of human cells, and extending the analysis 
to other 19 different transcriptome/translatome comparisons in mammalian cells following different stimuli or 
undergoing cell programs. 

Results: Triggering of the EGF pathway results in an early induction of transcriptome and translatome changes, but 
90% of the significant variation is limited to the translatome and the degree of concordant changes is less than 5%. 
The survey of other 19 different transcriptome/translatome comparisons shows that extensive uncoupling is a general 
rule, in terms of both RNA movements and inferred cell activities, with a strong tendency of translation-related genes 
to be controlled purely at the translational level. By different statistical approaches, we finally provide evidence of the 
lack of dependence between changes at the transcriptome and translatome levels. 

Conclusions: We propose a model of diffused independency between variation in transcript abundances and variation 
in their engagement on polysomes, which implies the existence of specific mechanisms to couple these two ways of 
regulating gene expression. 

Keywords: Polysomal, Profiling, Transcriptome, Translational, Control, Translatome 



Background orchestrated by incoming cell stimuli which elicit largely 

In the flow of genetic information, translational control is unknown transduction pathways, affecting primarily 

the level at which reprogramming of cell activities accesses translation initiation, i.e. the loading of ribosomes on mes- 

the phenotype, ultimately shaping protein synthesis and senger ribonucleoprotein particles (mRNP) to form poly- 

therefore, together with the control of protein degrad- somes, and secondarily translation elongation [3,4]. The 

ation, quantitative variation of the proteome. Originally ways in which these stimuli influence polysome formation 

studied in early stages of development in oocytes and involve "general" translation factors as eIF4E, eIF4G, 

embryos [1,2], translational control has been increasingly eIF4A and PABP, allowing mRNA circularization and ribo- 

recognized as a very general feature of eukaryotic cells, some scanning, and more specialized factors acting on 

extensively present also in mature tissues. This process is sequences found primarily in the 5' or 3' untranslated 

regions (UTRs) of mRNAs. These latter factors belong to 

— . . . the two classes of RNA binding proteins (RBPs) and 
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human genome the predicted genes coding for proteins 
involved in translational control are around a thousand 
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and the number of miRNAs, proven to be able to modu- 
late translation [5,6], is estimated between one and two 
thousands [7]. Furthermore, by recent transcriptome 
high-sensitivity sequencing scannings, the human ncRNA 
collection has risen to comprise around five thousands 
ncRNAs [8], to which the 18,000 [9] processed pseudo- 
genes have to be added because they also can interfere 
with gene expression [10]. If even a small fraction of these 
ncRNAs was involved in modulating translation, the 
amount of macromolecules potentially able to operate at 
the interface between mRNA and proteins would be 
extremely high. Moreover, recent findings reveal the pres- 
ence in eukaryotic cells of cytoplasmic RNA-containing 
granules (processing bodies, stress granules and other 
types) composed of aggregates of mRNPs where mRNA 
decay, editing and storage can take place [11-13]. These 
granules can generate a bidirectional flow of mRNAs with 
polysomes [14-16]. 

Given this complex layer of activities in the cytoplasm, 
we set the goal to estimate the relationship between fluc- 
tuations of mRNA levels in the cell and fluctuations of the 
fraction of mRNAs available for translation after a stimulus, 
which to our knowledge has never been addressed with a 
population-based approach. The degree of change in trans- 
lation-engaged mRNAs can be estimated by extracting 
mRNAs organized in polysomes by a classical separation 
technique, velocity sedimentation by sucrose gradients, and 
profiling them in parallel with total mRNA [17]. 

By measuring the total mRNAs of cells (the transcrip- 
tome) and the polysomally-loaded mRNAs (the transla- 
tome) after a growth stimulus, we obtained a picture of 
overall mismatching between the two changes for the 
majority of genes, to which we refer as "uncoupling" in the 
mRNA behavior. This was confirmed studying a number 
of other available profiles coming from very diverse experi- 
ments and kinetics. The marked, general uncoupling 
between transcriptome and translatome gene expression 
changes allowed us to propose a biological model by which 
the machineries responsible for mRNA availability in the 
cytoplasm and for mRNA engagement in translation lack 
overall dependency, therefore questioning the notion of 
continuity in the control of the flow of gene expression. 

Results 

Profound uncoupling between transcriptome and 
translatome gene expression variations upon EGF 
stimulation of HeLa cells 

To address the impact of translational regulation in 
reshaping transcriptome profiles we chose a classical para- 
digm of cellular reprogramming of gene expression, Epi- 
dermal Growth Factor (EGF) treatment of starved cells. 
This stimulus elicits a well-known chain of intracellular 
transduction events, resulting in a complex phenotypic 
spectrum of changes with prevalent induction of cell 



growth and proliferation [18,19]. As outlined in Figure 1A, 
we treated HeLa cells under serum starvation with EGF 
for 40 minutes (final concentration of 1 ug/ml). The acti- 
vation of the EGF signalling cascade is proved by an 
increased phosphorylation of AKT and ELK1, known 
EGFR downstream effectors [20,21], and by an increase of 
MYC, an early EGF transcriptional target [22] (Figure IB). 
Consistently with an overall engagement of the transla- 
tional machinery by EGF, the absorbance profiles obtained 
after sucrose gradient centrifugation of lysates from EGF- 
treated compared to control cells show a clear increase of 
RNA associated to the polysomal fractions and a con- 
comitant reduction of RNA present in the subpolysomal 
portion of the gradient (Figure 1C). We then profiled by 
gene expression arrays both the transcriptome and the 
translatome, before and after 40 minutes of EGF treat- 
ment. Microarray results were validated with quantitative 
real time PCR on a selected subset of twelve genes, show- 
ing a good concordance between the two independent sets 
of measurements (Figure 1F-G, in Additional file 1: Table 
SI): Pearson correlation was 0.82 for transcriptome data 
and 0.88 for translatome data. Differentially expressed 
genes (DEGs) upon EGF treatment were detected from 
microarray data with the RankProd algorithm [23] separ- 
ately at the transcriptome and translatome level. This 
allowed us to obtain a simple classification of DEGs into 
"coupled" or "uncoupled", based on the concordance of 
their variation between the transcriptome and the transla- 
tome (Figure 1A). We consider the DEGs coupled if they 
show a significant change in both the transcriptome and 
the translatome and if the change is homodirectional 
(always displayed in green in Figure 1A, ID and IE). They 
are instead scored as uncoupled if (a) they change signifi- 
cantly in both the transcriptome and the translatome but 
in an antidirectional way (always displayed in red through- 
out the paper), (b) they change significantly only in the 
transcriptome (always displayed in cyan) and (c) they 
change significantly only in the translatome (always dis- 
played in yellow). Following these criteria, the proportion 
of coupled DEGs observed in our experiment is only 4.8% 
(37 genes), against the overwhelming 95.2% proportion of 
uncoupled DEGs (665 genes; Figure IE, Additional file 2). 
Furthermore, among the uncoupled DEGs, purely transla- 
tome DEGs are nine times more frequent than purely 
transcriptome DEGs (597 against 64) and transcriptome 
DEGs result to be exclusively upregulated. Plotting trans- 
latome versus transcriptome fold changes makes clear that 
the variations in mRNA abundance are poorly correlated 
with the variations in mRNA polysomal engagement 
(Figure ID). Therefore, treatment of HeLa cells with a 
well-known growth factor results to target mostly transla- 
tion, with a negligible concordance between the two levels 
of regulation. We next sought to determine if the 
observed differences between the two profiles were also 
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Figure 1 EGF treatment of HeLa cells induces extensive uncoupling between transcriptome and translatome gene expression 
variations. (A) Flowchart of differential expression analysis between transcriptome and translatome after EGF treatment and definition of 
uncoupling. Uncoupling qualifies genes classified as DEGs (differentially expressed genes) with significant variations only in the transcriptome (in 
cyan), only in the translatome (in yellow) and with opposite significant variations between transcriptome and translatome (in red). Coupling 
qualifies genes classified as differentially expressed (DEGs) by both transcriptome and translatome profile comparisons and with homodirectional 
changes (in green). (B) Western blots indicating the activation of the EGFR signaling pathway by the increase of known EGFR mediators and 
targets: phosphorylated Akt1, phosphorylated Elkl and Myc. (C) Comparison between sucrose gradient profiles of HeLa cells without EGF (in 
black) and with EGF (in red). (D) Scatterplot of transcriptome and translatome log2 transformed fold changes, showing genes belonging to the 
coupling and uncoupling categories as defined in panel A. Spearman correlation between fold changes is also shown. (E) Barplot highlighting 
the uncoupling value between translatome and transcriptome DEGs. The number of DEGs and the corresponding percentages are displayed 
following the same colour scheme adopted in the rest of the figure (F-G) Scatterplot showing correlation between transcriptome (F) and 
translatome (G) log2 transformed fold changes derived from microarray hybridizations and quantitative RT-PCR on a set of twelve genes, 
displayed as black dots. Regression lines are drawn in grey. 
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reflected in variations of predicted cellular processes and 
activities. DEGs were annotated by sequence, protein 
domain, phylogenetic and functional descriptors: PIR 
resource [24], Inter Pro database [25], COG database [26], 
KEGG [27] and Biocarta pathway databases, Gene Ontol- 
ogy [28]. The high degree of uncoupling was confirmed by 
enrichment analysis of the transcriptome and translatome 
DEGs, resulting in sharply distinct patterns of significant 
terms, with only 27 common terms (17%), 90 transcrip- 
tome-specific terms and 43 translatome-specific terms 
(Additional file 1: Figure SI and Additional file 3). 

The high degree of uncoupling between transcriptome 
and translatome variation profiles is a general feature of 
the control of gene expression in mammalian cells 

To test whether our observation of strong discordance 
between the variations of total mRNAs and polysome- 
associated mRNAs could be of some generality in mam- 
mals, we systematically reanalyzed already published 
experiments in which both the transcriptome and the 
translatome (the last always isolated by sucrose gradient) 
were profiled in mammalian cells and tissues. We selected 
the experiments according to stringent quality standards 
(see Methods) to ensure technical comparability between 
different studies. Among an initial database of 16 mam- 
malian studies, we finally identified 10 experiments involv- 
ing observation of different treatments and processes in 
human, mouse and rat cells and tissues, giving a total of 
19 paired transcriptome/translatome datasets. The profiles 
belonged to three types of experiments: short-term treat- 
ments with extracellular stimuli (4 experiments, 6 paired 
datasets), differentiation processes in cells and tissues 
(3 experiments, 8 paired datasets) and induced genetic 
alterations of the translational machinery (4 experiments, 
5 paired datasets). The experiments are briefly described 
in Table 1 and extensively annotated in Additional file 4. 
All the microarrays used in the experiments belong to the 
Affymetrix platform: this decreases the risk of introducing 
in the following analyses cross -platform biases due to 
different manufacturing technologies (Additional file 1: 
Table S2 and Figure S3). Raw microarray data were sub- 
jected to the same normalization and DEGs selection pro- 
cedure previously described for the EGF experiment 
(processed data in Additional file 5). To measure the 
significance of differential expression, we chose the Rank- 
Prod algorithm because, transforming the actual expres- 
sion values into ranks, it offers a way to overcome the 
heterogeneity among multiple datasets and therefore to 
extract and integrate information from them [23] . In order 
to keep a methodological homogeneity, we also chose to 
apply for all the datasets the same significance threshold. 
To quantify the transcriptome/translatome uncoupling for 
each paired dataset, we calculated the percentage of 
uncoupled DEGs, which outnumbered coupled DEGs in 



two thirds of the analyzed datasets (14 out of 19 compari- 
sons, Figure 2A) the percentage of uncoupled DEGs ran- 
ging from 43.2% to 89.7% with an average of 64.8%. 
Conversely, the percentage of coupled DEGs ranges from 
a minimum of 10.3% to a maximum of 57.4%, with an 
average of 35.2%. Importantly, these relative proportions 
between uncoupled and coupled DEGs are stable even 
when using different significance thresholds to identify 
DEGs, or alternative DEG detection methods (Figure 2B 
and in Additional file 1: Figure S2). As alternatives we 
used £-test and SAM [29], by which we can show an even 
more extensive uncoupling than by RankProd. Therefore, 
this broad analysis confirmed that the marked uncoupling 
between transcriptome and translatome profiles is a fea- 
ture far from being confined to short-time treatment of 
HeLa cells with EGF, assuming instead the dimension of a 
general principle describing change of gene expression in 
mammals. 

Ontological enrichment and pathway analysis of 
transcriptome and translatome variations predict very 
different phenotypes 

We were then interested in estimating the impact of gene 
expression uncoupling on the cell activities ascribed to the 
transcriptome and the translatome DEGs, when studying 
the whole collection of experiments. All the lists of DEGs 
from the dataset pairs were independently subjected to 
ontological enrichment analysis as for our EGF experiment 
(data available in Additional file 6). We tested whether the 
gene expression uncoupling between transcriptome and 
translatome can originate a semantic specificity between 
the two relative sets of enriched ontological terms. Two 
measures of semantic specificity were adopted. The first 
measure is based on the simple enumeration of cell activ- 
ities that, as an effect of uncoupling, resulted enriched 
uniquely in the transcriptome or in the translatome DEGs 
(Figure 3A, color code of the boxplot). Transcriptome spe- 
cificity is higher (87%) than translatome specificity in the 
large majority of dataset pairs, except for three of them 
related to short-term cell treatments. The second measure 
of semantic specificity accounts also for semantic similarity 
relationships between not identical ontological terms (see 
Methods), and was applied to all the dataset pairs (red bars 
in Figure 3A). Semantic specificities were low, with an 
average value of 0.26 and with 16 dataset pairs falling 
below the midrange value of 0.5. To further estimate the 
extent of the distance between the transcriptome and the 
translatome of each experiment, we compared the seman- 
tic specificity measures with a reference distribution, 
calculated as the set of semantic specificities between the 
transcriptome of each dataset pair and the transcriptome 
of all the other datasets. Since the datasets collected were 
largely heterogeneous, they were assumed to show a low 
semantic relationship between their transcriptome DEGs. 



Tebaldi et al. BMC Genomics 2012, 13:220 
http://www.biomedcentral.eom/1 471 -21 64/1 3/220 



Page 5 of 15 



Table 1 Description of the datasets used for the analysis 



Surprisingly, the semantic specificity observed between 
the transcriptome and the translatome in all the dataset 
pairs except one was found within or below the distribu- 
tion, and in 13 of them below the distribution median 
(Figure 3A). Taken together, the results show unexpect- 
edly weak semantic similarity between the transcrip- 
tome and the translatome ontological enrichments of all 
the considered experiments. 

Finally, we wanted to derive from the global onto- 
logical analysis those cell activities more specifically 
characterizing transcriptome DEGs compared to trans- 
latome DEGs and vice versa. To provide a general view, 
individual over-represented GO terms from all dataset 
pairs were mapped to GOslim [30], a simplified version 
of GO. A clear outcome was that half of the transla- 
tome-specific terms (including translation, translation 
regulator activity, translation factor activity, ribosome) 
were exclusively translation-related (Figure 3B). This 
result provides additional support to the notion of inde- 
pendent transcriptome and translatome controls of gene 
expression variations. 



signalling 



machinery 
tion 



For each dataset, lists of transcriptome and translatome 
DEGs were subjected to further annotation with the In- 
genuity Pathway Analysis (IPA) library of canonical path- 
ways (data available in Additional file 7). The significance 
of the association between the DEGs and the canonical 
pathways was measured with the Fisher s exact test, and a 
0.05 cut-off on the Benjamini-Hochberg corrected p-value 
was used to identify significantly enriched pathways. Com- 
paring the number of pathways that resulted enriched 
uniquely in the transcriptome or in the translatome DEGs, 
we had another proof that the gene expression uncoupling 
between transcriptome and translatome is extended to a 
functional specificity between the two relative sets of 
enriched pathways (Additional file 1: Figure S5). Across all 
the dataset pairs, 97 pathways (22%) were significantly 
enriched only in transcriptome DEGs, 139 pathways (31%) 
only in translatome DEGs and 206 pathways (47%) in both 
transcriptome and translatome DEGs. In 14 out of the 16 
datasets with at least one enriched pathway, the number 
of specific pathways exceeds the number of common 
pathways. 



Short name 3 



Description 



Biological source Reference Data ID b 



Chip c 



Clusl 



+serum.0-2 h 


serum starvation release 


Mus musculus 


PMID 


1 7405863 


GSE7363 


MG_U74Av2 


+EPO.0-2 h 


erythroid EPO deprivation release 


Mus musculus 


PMID 


18625885 


E-MEXP-1689 


MG_U74Av2 


-LIF.0-5d 


stem cell differentiation through 
LIF removal 


Mus musculus 


PMID 


18462695 


GSE9563 


Mouse430_2 


+LPS.0-1 h 


macrophage LPS treatment (1 h) 


Mus musculus 


PMID 


18230670 


GSE4288 


Mouse430_2 


+LPS.0-2 h 


macrophage LPS treatment (2 h) 


Mus musculus 


PMID 


18230670 


GSE4288 


Mouse430_2 


+LPS.0-4 h 


macrophage LPS treatment (4 h) 


Mus musculus 


PMID 


18230670 


GSE4288 


Mouse430_2 


+diff.WT.hepa 


differentiation of WT 
hepatocytes 


Homo sapiens 


PMID 


18221535 


E-MEXP-958 


HG-U133A 


+diff.mTOR.hepa 


differentiation of mTOR 
activated hepatocytes 


Homo sapiens 


PMID: 17483347 


E-MEXP-958 


HG-U133A 


+diff.testis.P17-P22 


testis differentiation (5d) 


Mus musculus 


PMID 


16682651 


GSE471 1 


MOE430A 


+diff.testis.P17-P70 


testis differentiation (53d) 


Mus musculus 


PMID 


16682651 


GSE4711 
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lung differentiation (3d) 
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lung differentiation 
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proliferative hepatocytes 

mTOR activation of 
differentiated hepatocytes 



Homo sapiens 
Homo sapiens 
Homo sapiens 
Homo sapiens 



PMID 
PMID: 
PMID 
PMID 



1 7638893 
18426977 



GSE6043 
GSE11011 



16446406 E-MEXP-461 
17483347 E-MEXP-958 



HG-U133_Plus_2 translational 
HG-U133A_2 a ' terat 
HG_U95Av2 
HG-U133A 



Homo sapiens PMID: 17483347 E-MEXP-958 HG-U133A 

agent name, experimental time. 



(a) short name specifying exposure to or subtraction from (+ or -) a broadly defined perturbation agent, perturbation 

(b) dataset reference on GEO or ArrayExpress. 

(c) all the chips belong to the Affymetrix platform. 
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Figure 2 Widespread gene expression uncoupling is a general and recurring phenomenon in all transcriptome-translatome profiling 
datasets. (A) Barplot displaying the degree of uncoupling between transcriptome and translatome DEGs for each dataset. Collected datasets are 
labelled by short names as explained in Table 1. Bar lengths show the relative proportion of DEGs in the four classes defined in Table 1. The 
corresponding percentages of uncoupled DEGs are shown on the right. (B) Uncoupling estimate is independent from the significance threshold 
and the algorithm used for calling DEGs. Percentage of DEGs detected by the comparison (homodirectional change in green, antidirectional 
change in red) between both transcriptome and translatome profiles, DEGs detected by the transcriptome comparison only (in cyan) and DEGs 
detected by the translatome comparison only (in yellow) were computed over all the datasets described in Table 1. Three algorithms are shown: 
RankProd, t-test and SAM. Inside each barplot the significance thresholds ranges from 0.01 to 0.5. In the barplot generated with RankProd the red 
vertical dashed line indicates the 0.2 significance threshold used to detect DEGs throughout the analysis. For t-test and SAM a Benjamini- 
Hochberg multiple test correction was applied to the resulting p-values. 



The Ingenuity Knowledge Base was employed to 
build networks from the lists of transcriptome and 
translatome DEGs for each dataset. Networks were 
generated using experimentally validated direct inter- 
actions among DEGs (data available in Additional file 
8). Cellular functions associated to networks, based on 



the functional annotation of their genes, were ranked 
according to their translatome specificity (Additional 
file 1: Table S3). RNA post-transcriptional modifica- 
tion, again an mRNA related theme, resulted as a 
cellular function mainly associated to translatome 
networks. 
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Transcriptome and translatome variations are globally 
not dependent 

Having shown the high level of uncoupling between 
transcriptome and translatome variations by either a 
gene-oriented and a function-oriented perspective, we 
speculate that these variations could be controlled by 



largely independent regulatory mechanisms. If con- 
firmed, this hypothesis would falsify the conventional 
model of gene expression change where transcriptome 
fluctuations induced by regulated mRNA synthesis or 
degradation are implicitly considered determinants of 
translatome changes, through "mass effects" of increased 
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Figure 3 (See legend on next page.) 
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(See figure on previous page.) 

Figure 3 Uncoupling between transcriptome and translatome is conserved in the enriched biological themes. (A) Summary of semantic 
specificity estimates (based on the optimized quantification of semantic specificity described in SI Materials and Methods). Red dotted lines 
represent semantic specificity estimates relative to the transcriptome and translatome comparisons within all datasets. Box and whisker plots 
show the reference distributions of semantic specificities (whiskers indicating minimal and maximal distribution values), characteristic of each 
dataset and reflecting semantic specificity estimates between the transcriptomes of unrelated dataset pairs. A semantic specificity falling within or 
below the reference distribution is indicative of very poor semantic similarity between the transcriptome and the translatome in a dataset pair. 
The color associated to the box of each dataset pair corresponds to the normalized difference between the number of GO terms over- 
represented only at the translatome level and the number of GO terms over-represented only at the transcriptome level, a quantity ranging from 
-1 (all the terms are enriched only at the transcriptome level, in blue) to 1 (all the terms are enriched only at the translatome level, in yellow). 
This measure is positive for the first three datasets on the left and negative for all the others (divided by a vertical dashed line in the figure). 
Having no overrepresented ontological terms, the dataset + mTOR.diff is not displayed. (B) For each GO term the transcriptome and translatome 
specificity degrees are calculated as the ratio between the number of datasets in which the term is transcriptome or translatome specific and the 
number of datasets in which the term is overrepresented. Terms are grouped into the broader GOslim categories and the median specificity 
values are calculated. The number of GO terms grouped in each GOslim category is specified in round brackets. Within each of the three GO 
domains (from left to right: Biological Process, Cellular Component and Molecular Function), categories are sorted from the most translatome- 
specific (in yellow) to the most transcriptome-specific (in blue). 

V J 



or decreased mRNA quantities on polysomal occupancy 
[31]. Indeed, the results of three different statistical tests 
carried out on the available DEG profiles support a 
counterintuitive model of transcriptome and translatome 
relative autonomy (Figure 4). The conventional depend- 
ency model reasonably generates the following expecta- 
tions: (1) the total number of translatome DEGs should 
be dependent on the total number of transcriptome 
DEGs, (2) significant variations of expression of a gene 
in the transcriptome should be reflected in the transla- 
tome, and therefore transcriptome DEGs should overlap 
translatome DEGs in a statistically significant manner. 
Neither expectation was confirmed by our analysis. In 
fact, the likelihood ratio test clearly rejected the first 
expectation, by supporting the notion that the numbers 
of transcriptome and translatome DEGs are independent 
in 17 out of the 19 comparisons (Figure 4A). Further- 
more, when we tested the second expectation, we found 
the observed overlap between transcriptome and transla- 
tome DEGs to be comparable with the overlap deriving 
from random sampling of gene variations of expression, 
never passing a 0.01 p- value threshold for significance 
by standard non-parametric bootstrap (Figure 4B). To 
further assess this strong indication of independence, we 
finally estimated the mutual information between tran- 
scriptome and translatome variations, modeled as binary 
variables. Across all comparisons mutual information 
values ranged from 0.02 to 0.21, with an average value of 
0.09. When we took into account the minimal and max- 
imal mutual information values allowed by the frequen- 
cies of DEGs in each dataset pair (corresponding 
respectively to the event of null overlap and complete 
overlap between transcriptome and translatome DEGs), 
the observed mutual information values were not found 
to deviate from the overall midrange values (mean abso- 
lute deviation 0.08). The lack of substantial mutual 
dependence between transcriptome and translatome 
DEGs was confirmed by the fact that the observed 



mutual information values never significantly exceed the 
corresponding values in random bootstrapping samples 
(0.01 significance threshold; Figure 4C). 

Past studies employing yeast and reticulocyte lysates 
[32-35] have claimed that mRNAs have to numerically 
compete to gain access to ribosomes and to form poly- 
somes. According to this view, polysomes should gen- 
erally buffer transcriptome variations, except for those 
transcripts associated to trans-acting factors specific- 
ally increasing their probability of access to polysomes. 
To verify this hypothesis in our mammalian datasets, 
we counted the number of translationally enhanced 
(polysomal mRNA fold change > total mRNA fold 
change) and translationally buffered (polysomal mRNA 
fold change < total mRNA fold change) mRNAs across 
all the 19 dataset pairs. Since the proportions are 
roughly 50% and 50% (46% with decreased polysomal 
access, 54% with increased polysomal access) without 
a significant majority of genes buffered at the polyso- 
mal level, we suggest that in mammalian cells the 
competition of mRNA for ribosomes is not a general 
driving force regulating translation, unless the action 
of trans factors promoting polysome formation has the 
same magnitude of the polysomal competition effect. 

In conclusion, we suggest that by analyzing the available 
data with different approaches, the mRNA production/ 
degradation and the mRNA access to translation appear 
to be globally regulated not in an interdependent way. 

Discussion 

The conceptualization which framed molecular genetics 
studies for four decades is the so-called central dogma 
[36], representing the forward flow of gene expression 
from DNA to mRNA to proteins through transcription 
and translation. This directional flow can easily be viewed 
as an assembly line in which the translation step is auto- 
matically determined by the availability of mRNAs pro- 
duced by the transcription step. Following this scheme, 
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Figure 4 Gene expression uncoupling is consistent with a hypothesis of lack of dependence between transcriptome and translatome 
variations. Results in agreement with the lack of dependence hypothesis are labeled with a green square, while results rejecting the lack of 
dependence hypothesis are labeled with a red square. (A) Likelihood Ratio p-values, testing the hypothesis that the numbers of DEGs in the 
transcriptome and the translatome are different, result significant for 17 of 19 datasets (P < 0.01). (B) The overlap observed between 
transcriptome and translatome DEGs is never significantly higher than its random estimate (random overlap P > 0.01 in 19 out of 19 datasets). 
(C) Mutual information observed between transcriptome and translatome is never significantly higher than its random estimate (random mutual 
information P > 0.01 in 19 out of 19 datasets). Theoretical mutual information minima and maxima are also calculated for each dataset as 
explained in Methods. The positions of the real mutual information values inside the range defined by the theoretical minima and maxima are 
visualized as grey histograms. 



changes in the quantities of an mRNA species due to 
changes in its transcription and/or degradation rate after a 
stimulus determine changes in its translation rate. 

Prompted by the recent appreciation of translational 
control being more widespread than originally thought 
[37], and by the discovery of universal cytoplasmic foci 
of mRNA accumulation [38], in this work we wanted to 
address the quantitative population changes induced by 
a cell stimulus between the cytoplasmic mRNAs and 
those mRNAs supposed to be actively engaged in trans- 
lation because part of polysomes. Inspired by the results 
of an experiment of EGF treatment on HeLa cells, we 
extended the analysis to a number of available published 
data on mammalian systems, all obtaining the transla- 
tome measures after sucrose gradient separation of poly- 
somes. Therefore the outcome proposed, deriving from 
8 treatments and 2 developmental assays performed on 
10 different types of mammalian cells, can be likely 
regarded as general for mammals. 

We found that the degree of uncoupling between the 
transcriptome and the translatome was higher than the de- 
gree of coupling, both in terms of single transcripts under- 
going changes in levels and in terms of the ontological 



enrichment of the corresponding proteins. A striking result 
of the ontological analysis is that the transcriptome vari- 
ation profiles of two different, unrelated experiments are as 
diverse as each of them compared with the corresponding 
translatome variations of the same experiment (Figure 3A). 
From this result we derive a message of partiality of tran- 
scriptome data in representing cell phenotypes, no matter 
how much quantitatively accurate. We also observed that a 
general tendency to establish autoregulatory or crossregula- 
tory loops should be a specific feature of mRNAs and pro- 
teins involved in translation. In fact, among the mRNAs 
that change their abundance purely at the translatome level, 
we observe a strong enrichment in encoded translation- 
related proteins. This finding extends previous observations 
of self-regulating translation activities, such as the well- 
known example of the TOP genes, that are involved in the 
translation basal machinery and at the same time are regu- 
lated at the translational level after cell growth stimuli [39]. 
This observation provides a first clue for understanding the 
independence of translatome changes by the transcriptome: 
following cell stimuli which act on their expression, many 
genes coding for components of the translational machin- 
ery do not undergo any change in their mRNA levels, but 
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only variations in the rate of polysomal loading of these 
mRNAs. Moreover, we know that several mammalian RBPs 
bind their own mRNA and the mRNAs encoding other 
RBPs, regulating their stability and translation [40]. In 
agreement, reconstruction in the yeast of RBP expression 
networks suggests that at least one third of the studied pro- 
teins post-transcriptionally auto regulate themselves, acting 
as network hubs [41]. In metazoan a possible strong evolu- 
tionary pressure for the wiring of these RBP-mediated post- 
transcriptional looping circuits could derive from the need 
to regulate protein synthesis of maternal mRNAs in oocytes 
and in early embryo development, a well-studied process in 
C. elegans and Drosophila [42]. A stimulating framework to 
explain the complexity and the relative independence of 
post-transcriptional networks from transcriptional events is 
proposed by Keene with the concept of post-transcriptional 
regulons, clusters of discrete mRNAs co-regulated by the 
same set of RBPs in order to orchestrate complex cellular 
functions [43,44]. 

Indirect observations sustaining the view of divergence 
between the transcriptome and the translatome come 
instead from en-masse analyses comparing absolute 
mRNA and protein abundance. While in many prokary- 
otic and eukaryotic systems mRNA levels can describe 
no more than 50% of protein levels [45,46], in a recent 
work [47] on human tumor cells the matching is lower 
(Pearson correlation 0.29) and translation-related fea- 
tures (as coding sequence, 5' UTR, 3' UTR lengths, pres- 
ence of upstream open reading frames, density of 
secondary structures in the 5' UTR, amino-acid compos- 
ition) contribute with 30% to the predictability of protein 
concentration. Similar conclusions, with a predominant 
role in control given to translation, were drawn from a 
quantitative model based on mRNA and protein abun- 
dances, synthesis rates and half-lives in mouse fibro- 
blasts and human breast cancer cells [48]. Protein 
abundances are also more conserved than total mRNA 
abundances among different taxa, suggesting that tran- 
scriptome networks are less affected by evolutionary 
pressure than proteome networks [49]. 

Direct transcriptome to translatome comparison studies 
after severe stresses in yeast provide a picture of a marked 
translational shutdown from which highly concordant 
homodirectional changes emerge for some genes [50-53]. 
Exposing yeast to mild stresses, instead, produces a 
response characterized by a high level of uncoupling [53] . 
These last mild perturbations and their dynamic effects 
can be better assimilated to the stimuli analyzed here in 
mammalian cells. A statistical treatment of the popula- 
tions of mRNAs undergoing variations in our collection of 
comparable dataset pairs provides the falsification of a 
model of straight dependency between translatome 
changes from transcriptome changes. As a possible alter- 
native, a parsimonious model in line with these data could 



postulate a general orthogonality between the mechanisms 
controlling transcript levels and translation in mammalian 
cells. In other words, changes in abundance of a given 
mRNA do not determine per se any effect on changes in 
its polysomal engagement. While mRNA abundance is 
controlled in a sequence-dependent way by its rate of 
transcription and degradation, polysomal engagement of 
mRNA is determined by translation factors interacting 
with sequence and structural motifs present in the mRNA 
itself [54]. These controls do not depend on mRNA abun- 
dance if not following titration of the trans-acting factors 
involved, as recently shown for miRNAs [55]. The view 
proposed by this model could speculatively match that of 
the proposed stochastic, "burst-like" nature of transcrip- 
tion [56], characterized by variable kinetics and refractory 
periods [57], producing therefore a noisy transcriptome 
which could be later shaped into a more stable proteome 
by translational control. 

Following this parsimonious model the observed degree 
of coupling between fluctuations of the mRNA levels in 
the cells and fluctuations in productive ribosome engage- 
ment should be an effect of specific mechanisms of mo- 
lecular pairing between mRNA steady state determinants 
(i.e., controls of chromatin remodeling, transcription and 
mRNA decay) and regulation of translation of the same 
mRNA. The prediction emerging from this study is that a 
variety of coupling mechanisms, some of which already 
described [58,59], should be active in mammalian cells to 
orchestrate cell and tissue primary programs. 

Conclusions 

Our study estimated the genome-wide correlation 
between changes in mRNA abundance and mRNA poly- 
somal loading in an unprecedentedly large collection of 
mammalian cells and tissues subjected to heterogeneous 
stimuli. From our results we conclude that the control of 
gene expression at the polysomal level is pervasive with 
no exceptions, and genes whose expression changes 
homodirectionally at the transcriptome and the transla- 
tome level represent a minority of those perturbed by the 
stimuli. From a statistic point of view the variations in the 
degree of mRNA polysomal loading are, on the whole, 
independent from variations in mRNA abundance. This 
independency is further extended to the cell activities 
inferred from the ontological analysis of transcriptome 
and translatome differentially expressed genes, with a clear 
tendency of translation-related genes to be controlled 
purely at the translational level without modifications in 
the levels of their transcripts. 

Methods 

EGF treatment of HeLa cells 

HeLa CCL-2 cells were cultured in DMEM supplemented 
with 10% FBS, 2mM glutamine, 100 units/ml penicillin, 
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and 100 mg/ml streptomycin at 37 °C, 5% C0 2 . Cells were 
seeded on adherent plates and serum starved for 12h with 
DMEM, 0.5% FBS, 2mM glutamine. Cells were treated for 
40 minutes with recombinant human Epidermal Growth 
Factor (EGF from RD Systems, Minneapolis) at the final 
concentration of 1 \ig ml" 1 . Cell lysates were collected 
before (t=0 min) and after (t=40 min) EGF treatment. For 
the total and polysomal RNA extraction, 3 x 105 cells/well 
(6 well-plates) and 1.5 x 106 cells/dish (10mm dishes) 
were seeded, respectively, in order to have the same 
concentration of cells and the same surface density on the 
dishes). All experiments were run in biological triplicates. 

Total RNA extraction 

Total RNA was extracted using the TRIZOL reagent 
according to the manufacturer's protocol. RNA was 
quantified using a spectrophotometer and its quality was 
checked by agarose gel electrophoresis and by the Agi- 
lent 2100 Bioanalyzer platform, following the manifac- 
turer s guidelines for sample preparation and analysis of 
data (Agilent 2100 Bioanalyzer 2100 Expert User's 
Guide). 

Polysomal RNA extraction 

Cells were washed once with phosphate buffer saline 
(PBS + cycloheximide 10 \ig ml" 1 ) and treated directly on 
the plate with 300 \d lysis buffer [10 mM NaCl, 10 mM 
MgCl 2 , 10 mM Tris-HCl, pH 7.5, 1% Triton X-100, 1% 
sodium deoxycholate, 0.2 U [A 1 RNase inhibitor (Fer- 
mentas), cycloheximide 10 \ig ml" 1 and 1 mM dithio- 
threitol] and transferred to an Eppendorf tube. After a 
few minute incubation on ice with occasional vortexing, 
the extracts were centrifuged for 5 min at 12,000 g at 4 °C. 
The supernatant was stored at -80 °C or loaded directly 
onto a 15-50% linear sucrose gradient containing 30 mM 
Tris-HCl, pH 7.5, 100 mM NaCl, 10 mM MgCl 2 , and cen- 
trifuged in an Sorvall rotor for 100 min at 180,000 g. Frac- 
tions (polysomal and subpolysomal) were collected 
monitoring the absorbance at 254 nm and treated directly 
with proteinase K. After phenol-chloroform extraction 
and isopropanol precipitation, polysomal RNA was resus- 
pended in 30 \A of water. RNA quality was assessed by 
agarose gel electrophoresis and by the Agilent 2100 Bioa- 
nalyzer platform. 

Quantitative real-time RT-PCR 

Reverse Transcription of RNA to produce cDNA was done 
on total and polysomal extracts with the Superscript® 
VILO™ cDNA Synthesis Kit (Invitrogen). TaqMan quan- 
titative real-time PCR was performed in a 10-|iL reaction 
with a KAPA PROBE FAST universal qPCR (Kapa Biosys- 
tems). Four genes were used as endogenous controls: 
ACTB, GADPH, HPRT1, TBP. The geometric mean of the 
four controls was used to calculate the AC T for twelve 



other genes: MFAP4, TSC22D2, GPM6A, PSAPL1, AG2, 
EGR1,PCIF1, EGR2, ZNF655, RPL27, SLC2A3, RPL10A . 
To compare gene expression before and after EGF, the 
AAC T method was used. All reactions were performed in 
3-9 technical replicates for each RNA purified from all the 
three biological replicates. TaqMan primers and probes 
used in analyses (purchased from Applied Biosystems) are 
listed in Additional file 1: Table SI. 

Microarray hybridization and scanning, data acquisition 
and analysis 

Total, polysomal and subpolysomal RNA were hybridized 
on the Agilent-014850 Whole Human Genome Micro- 
array 4x44K G4112F following the manifacturers proto- 
col. Hybridized microarray slides were scanned with an 
Agilent DNA Microarray Scanner G2505C. \im resolution 
with the manufacturers software (Agilent ScanControl 
8.1.3). The scanned TIFF images were analyzed numeric- 
ally and background corrected using the Agilent Feature 
Extraction Software version 10.7.7.1 according to the 
Agilent standard protocol GEl_107_Sep09. The output of 
Feature Extraction was analyzed with the R software 
environment for statistical computing (htpp://www.r-pro- 
ject.org/) and the Bioconductor library of biostatistical 
packages (http://www.bioconductor.org/). Low signal Agi- 
lent features (11,003), distinguished by a repeated "absent" 
detection call across the majority of the arrays in every con- 
dition, were filtered out from the analysis, leaving 30,075 
features corresponding to 15,258 HGNC genes. Signal 
intensities across arrays were normalized with the quantile 
normalization algorithm [60]. Signals intensities from 
probes associated with the same gene were averaged. DEGs 
were identified with the Rank Product method implemen- 
ted in the Bioconductor RankProd package (pfp< 0.2 as 
threshold). All microarray data are available through the 
Gene Expression Omnibus database (http://www.ncbi.nlm. 
nih.gov/geo/) using the accession number GSE20277. 

Western blotting 

Cells were lysed in Ripa lysis buffer (Tris 50 mM a pH 7.4, 
NaCl 150 mM, Igepal CA-630 1%, EDTA 1 mM, Na deoxy- 
cholate 0.5%) containing protease and phosphatase inhibi- 
tors (Sigma- Aldrich). Total cell extracts were diluted in 2X 
SDS protein gel loading solution, boiled for 5 min, sepa- 
rated on 12% SDS-polyacrylamide gel electrophoresis 
(SDS-PAGE) and processed following standard procedures. 
The goat polyclonal antibody anti-phospo-eIF4E (Santa 
Cruz Biotechnology, Santa Cruz, CA) was diluted at 1:500, 
the rabbit anti-phospho-Akt (Cell Signaling Technology, 
Danrers, MA) at 1:1000, the goat anti-beta-actin (Santa 
Cruz Biotechnology, Santa Cruz, CA) at 1:1000 and the 
rabbit anti-Myc (Cell Signaling Technology, Danrers, MA) 
at 1:1000. The nitrocellulose membrane signals were 
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detected by chemiluminescence. Experiments were per- 
formed at least three times for each cell preparation. 

Ontological analysis of DEGs 

The DAVID resource [61] was used for gene-annotation 
enrichment analysis of the transcriptome and the transla- 
tome DEG lists with categories from the following 
resources: PIR (http://pir.georgetown.edu/), Gene Ontology 
(http://www.thegeneontology.org), KEGG (http://www. 
genome.jp/kegg/) and Biocarta (http://www.biocarta. 
com/default.aspx) pathway databases, PFAM (http:// 
pfam.sanger.ac.uk/) and COG (http://www.ncbi.nlm.nih. 
gov/COG/) databases. The significance of overrepresen- 
tation was determined at a false discovery rate of 5% 
with Benjamini multiple testing correction. Matched 
annotations were used to estimate the uncoupling of 
functional information as the proportion of annotations 
overrepresented in the translatome but not in the tran- 
scriptome readings and vice versa. 

Data collection, pre-processing and identification of 
differentially expressed genes (DEGs) 

High-throughput data on global changes at the tran- 
scriptome and translatome levels were gathered from 
public data repositories: Gene Expression Omnibus 
(http://www.ncbi.nlm.nih.gov/geo/), ArrayExpress (http:// 
www.ebi.ac.uk/microarray-as/ae/), Stanford Microarray 
Database (http://smd.stanford.edu/). Minimum require- 
ments we established for datasets to be included in our 
analysis were: full access to raw data, hybridization 
replicas for every experimental condition, two-group 
comparison (treated group vs. control group) for both 
transcriptome and translatome. Selected datasets are 
detailed in Table 1 and Additional file 4. Raw data were 
treated following the same procedure described in the 
previous section to determine DEGs in either the tran- 
scriptome or the translatome. Additionally, £-test and 
SAM were used as alternative DEGs selection methods 
applying a Benjamini Hochberg multiple test correction 
to the resulting p-values. 

Pathway and network analysis with IPA 

The IPA software (Ingenuity Systems, http://www. 
ingenuity.com) was used to assess the involvement of 
transcriptome and translatome differentially expressed 
genes in known pathways and networks. IPA uses the 
Fisher exact test to determine the enrichment of DEGs 
in canonical pathways. Pathways with a Bonferroni- 
Hochberg corrected p-value < 0.05 were considered signifi- 
cantly over-represented. IPA also generates gene networks 
by using experimentally validated direct interactions stored 
in the Ingenuity Knowledge Base. The networks generated 
by IPA have a maximum size of 35 genes, and they receive 
a score indicating the likelihood of the DEGs to be found 



together in the same network due to chance. IPA networks 
were generated from transcriptome and translatome DEGs 
of each dataset. A score of 4, used as a threshold for identi- 
fying significant gene networks, indicates that there is only 
a 1/10000 probability that the presence of DEGs in the 
same network is due to random chance. Each significant 
network is associated by IPA to three cellular functions, 
based on the functional annotation of the genes in the net- 
work. For each cellular function, the number of associated 
transcriptome networks and the number of associated 
translatome networks across all the datasets was calculated. 
For each function, a translatome network specificity degree 
was calculated as the number of associated translatome 
networks minus the number of associated transcriptome 
networks, divided by the total number of associated net- 
works. Only cellular functions with more than five asso- 
ciated networks were considered. 

Semantic similarity 

To accurately measure the semantic transcriptome- 
to-translatome similarity, we also adopted a measure of 
semantic similarity that takes into account the contribu- 
tion of semantically similar terms besides the identical 
ones. We chose the graph theoretical approach [62] 
because it depends only on the structuring rules 
describing the relationships between the terms in the 
ontology in order to quantify the semantic value of each 
term to be compared. Thus, this approach is free from 
gene annotation biases affecting other similarity measures. 
Being also specifically interested in distinguishing between 
the transcriptome specificity and the translatome specifi- 
city, we separately computed these two contributions to 
the proposed semantic similarity measure. In this way the 
semantic translatome specificity is defined as 1 minus the 
averaged maximal similarities between each term in the 
translatome list with any term in the transcriptome list; 
similarly, the semantic transcriptome specificity is defined 
as 1 minus the averaged maximal similarities between 
each term in the transcriptome list and any term in the 
translatome list. Given a list of m translatome terms 
and a list of n transcriptome terms, semantic transla- 
tome specificity and semantic transcriptome specificity 
are therefore defined as: 

Ei</<« max(sem.sim.ij) 

semantic -translatome specificity =1 ;< 

m 

semantic -transcriptome specificity =1 < j< m 

n 

where sem.sim. is the semantic similarity between two GO 
terms. Both transcriptome specificity and translatome spe- 
cificity range from 0 (no specificity) to 1 (full specificity). 
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Calculation of the semantic transcriptome Vs translatome 
specificity degree associated to GOslim terms 

For each GO term the transcriptome specificity degree 
is calculated as the ratio between the number of data- 
sets in which it is transcriptome specifically over- 
represented and the number of datasets in which it is 
over-represented, while the translatome specificity 
degree is calculated as the ratio between the number 
of datasets in which it is translatome specifically over- 
represented and the number of datasets in which it is over- 
represented. According to the GO structure, terms are 
grouped into the parental GOslim categories and the me- 
dian transcriptome and translatome specificity degrees are 
calculated. Within each of the three GO domains, categor- 
ies were sorted from the most transcriptome specific to the 
most translatome specific by subtracting the transcriptome 
specificity degree from the translatome specificity degree. 

Likelihood ratio test 

The Likelihood Ratio test was used to test the null 
hypothesis that DEG numbers are the same between 
transcriptome and translatome, against the alternative 
hypothesis that they can be different. 

Random overlap test 

For each dataset, nl and n2 genes were randomly 
extracted from the population of DEGs (nl and n2 being 
the real numbers of observed transcriptome and transla- 
tome DEGs for the dataset). The number of common 
genes was calculated as the random overlap and the ex- 
traction process was repeated 1 million times. The over- 
lap test calculates the probability of the observed overlap 
to be higher than the random overlap. 

Mutual information test 

Mutual information is used in each dataset to measure the 
mutual dependence between being a transcriptome DEGs 
and being a translatome DEG. Each of the two variables is 
discrete, taking the value of 1 if the gene is differentially 
expressed, 0 if the gene is not differentially expressed. 
Minimal mutual information for each dataset is calculated 
as the case in which the two lists of nl transcriptome 
DEGs and n2 translatome DEGs have null overlap. Max- 
imal mutual information is calculated as the case in which 
the two lists of DEGs are completely overlapping and have 
size (nl + n2)/2. Random mutual information is calculated 
for each dataset from one million of random extractions, 
similarly as described in the previous section. The mutual 
information test calculates the probability of the observed 
mutual information to be higher than the random mutual 
information. 
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